X-CNN: Cross-modal Convolutional Neural Networks for Sparse Datasets
In this paper we propose cross-modal convolutional neural networks (X-CNNs),
a novel, biologically inspired type of CNN architecture, treating gradient
descent-specialised CNNs as individual units of processing in a larger-scale
network topology, while allowing for unconstrained information flow and/or
weight sharing between analogous hidden layers of the network---thus
generalising the already well-established concept of neural network ensembles
(where information typically may flow only between the output layers of the
individual networks). The constituent networks are individually designed to
learn the output function on their own subset of the input data, after which
cross-connections between them are introduced after each pooling operation to
periodically allow for information exchange between them. This injection of
knowledge into a model (by prior partition of the input data through domain
knowledge or unsupervised methods) is expected to yield greatest returns in
sparse data environments, which are typically less suitable for training CNNs.
For evaluation purposes, we have compared a standard four-layer CNN as well as
a sophisticated FitNet4 architecture against their cross-modal variants on the
CIFAR-10 and CIFAR-100 datasets with differing percentages of the training data
being removed, and find that at lower levels of data availability, the X-CNNs
significantly outperform their baselines (typically providing a 2--6% benefit,
depending on the dataset size and whether data augmentation is used), while
still maintaining an edge on all of the full dataset tests.
Comment: To appear in the 7th IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2016), 8 pages, 6 figures. Minor revisions in response to reviewers' comments.
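To make the cross-connection idea concrete, below is a minimal sketch in PyTorch (an assumption; it is not the authors' implementation). Two small CNN streams each receive a prior channel partition of a CIFAR-sized image, and 1x1-convolution cross-connections exchange feature maps after each pooling stage; the stream split, widths, and depths are illustrative choices only.

```python
# Illustrative X-CNN-style sketch: two streams over different channel subsets,
# with cross-connections injected after each pooling operation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class XCNNSketch(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Stream A sees one channel, stream B the remaining two (assumed split)
        self.a1 = nn.Conv2d(1, 16, 3, padding=1)
        self.b1 = nn.Conv2d(2, 16, 3, padding=1)
        # Cross-connections: project one stream's maps and add them to the other
        self.a_to_b = nn.Conv2d(16, 16, 1)
        self.b_to_a = nn.Conv2d(16, 16, 1)
        self.a2 = nn.Conv2d(16, 32, 3, padding=1)
        self.b2 = nn.Conv2d(16, 32, 3, padding=1)
        self.fc = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        xa, xb = x[:, :1], x[:, 1:]              # prior partition of the input
        a = F.max_pool2d(F.relu(self.a1(xa)), 2)
        b = F.max_pool2d(F.relu(self.b1(xb)), 2)
        # information exchange after the pooling operation
        a, b = a + self.b_to_a(b), b + self.a_to_b(a)
        a = F.max_pool2d(F.relu(self.a2(a)), 2)
        b = F.max_pool2d(F.relu(self.b2(b)), 2)
        return self.fc(torch.cat([a, b], dim=1).flatten(1))

model = XCNNSketch()
logits = model(torch.randn(4, 3, 32, 32))        # CIFAR-sized input
print(logits.shape)                              # torch.Size([4, 10])
```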
EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices
In recent years, advances in deep learning have resulted in unprecedented
leaps in diverse tasks spanning from speech and object recognition to context
awareness and health monitoring. As a result, an increasing number of
AI-enabled applications are being developed targeting ubiquitous and mobile
devices. While deep neural networks (DNNs) are getting bigger and more complex,
they also impose a heavy computational and energy burden on the host devices,
which has led to the integration of various specialized processors in commodity
devices. Given the broad range of competing DNN architectures and the
heterogeneity of the target hardware, there is an emerging need to understand
the compatibility between DNN-platform pairs and the expected performance
benefits on each platform. This work attempts to demystify this landscape by
systematically evaluating a collection of state-of-the-art DNNs on a wide
variety of commodity devices. In this respect, we identify potential
bottlenecks in each architecture and provide important guidelines that can
assist the community in the co-design of more efficient DNNs and accelerators.
Comment: Accepted at MobiSys 2019: 3rd International Workshop on Embedded and Mobile Deep Learning (EMDL), 2019
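The kind of measurement this entails can be illustrated with a small micro-benchmark in the spirit of the paper (this is not the EmBench harness itself): time average single-image inference latency for a few off-the-shelf DNNs on whatever device is at hand. The model list, warm-up count, and repeat count are arbitrary choices for the sketch.

```python
# Per-model inference-latency micro-benchmark on the available device.
import time
import torch
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
candidates = {
    "resnet50": models.resnet50(weights=None),
    "mobilenet_v2": models.mobilenet_v2(weights=None),
}
x = torch.randn(1, 3, 224, 224, device=device)

for name, net in candidates.items():
    net = net.to(device).eval()
    with torch.no_grad():
        for _ in range(5):                      # warm-up iterations
            net(x)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(20):
            net(x)
        if device == "cuda":
            torch.cuda.synchronize()
    print(f"{name}: {(time.perf_counter() - start) / 20 * 1e3:.1f} ms/inference")
```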
FDAPT: Federated Domain-adaptive Pre-training for Language Models
Combining Domain-adaptive Pre-training (DAPT) with Federated Learning (FL)
can enhance model adaptation by leveraging more sensitive and distributed data
while preserving data privacy. However, few studies have focused on this
method. Therefore, we conduct the first comprehensive empirical study to
evaluate the performance of Federated Domain-adaptive Pre-training (FDAPT). We
demonstrate that FDAPT can maintain downstream task performance competitive with
the centralized baseline in both IID and non-IID situations. Furthermore, we
propose a novel algorithm, Frozen Federated Domain-adaptive Pre-training
(FFDAPT). FFDAPT improves the computational efficiency by 12.1% on average and
exhibits similar downstream task performance to standard FDAPT, with general
performance fluctuations remaining less than 1%. Finally, through a critical
evaluation of our work, we identify promising future research directions for
this new research area.
Comment: 6 pages
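A minimal simulation of the federated pre-training loop described here might look like the following sketch (an assumption, not the paper's code): each client runs a few local update steps, the server averages the resulting weights (FedAvg), and the FFDAPT-style variant freezes a subset of layers to cut local computation. The tiny model and synthetic data stand in for a real language model and client corpora.

```python
# FedAvg-style aggregation with optional layer freezing (FFDAPT-like variant).
import copy
import torch
import torch.nn as nn

def local_train(model, data, steps=5, freeze_prefix=None):
    model = copy.deepcopy(model)
    if freeze_prefix:                                  # FFDAPT-style freezing
        for name, p in model.named_parameters():
            if name.startswith(freeze_prefix):
                p.requires_grad = False
    opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.1)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        x, y = data
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model.state_dict()

def fedavg(states):
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in states]).mean(0)
    return avg

global_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
clients = [(torch.randn(8, 16), torch.randn(8, 16)) for _ in range(3)]

for communication_round in range(2):
    states = [local_train(global_model, d, freeze_prefix="0.") for d in clients]
    global_model.load_state_dict(fedavg(states))
print("finished", communication_round + 1, "rounds")
```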
Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices
Convolutional Neural Networks (CNNs) have revolutionized the research in
computer vision, due to their ability to capture complex patterns, resulting in
high inference accuracies. However, the increasingly complex nature of these
neural networks means that they are particularly suited for server computers
with powerful GPUs. We envision that deep learning applications will eventually
be widely deployed on mobile devices, e.g., smartphones,
self-driving cars, and drones. Therefore, in this paper, we aim to understand
the resource requirements (time, memory) of CNNs on mobile devices. First, by
deploying several popular CNNs on mobile CPUs and GPUs, we measure and analyze
the performance and resource usage for every layer of the CNNs. Our findings
point out the potential ways of optimizing the performance on mobile devices.
Second, we model the resource requirements of the different CNN computations.
Finally, based on the measurement, profiling, and modeling, we build and
evaluate our modeling tool, Augur, which takes a CNN configuration (descriptor)
as the input and estimates the compute time and resource usage of the CNN, to
give insights about whether and how efficiently a CNN can be run on a given
mobile platform. In doing so, Augur tackles several challenges: (i) how to
overcome profiling and measurement overhead; (ii) how to capture the variance in
different mobile platforms with different processors, memory, and cache sizes;
and (iii) how to account for the variance in the number, type, and size of
layers of the different CNN configurations.
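The two ingredients the abstract combines, per-layer measurement and analytical resource modeling, can be sketched as follows (this is not Augur itself): per-layer latency is captured with forward hooks, and a convolution's multiply-accumulate count and activation memory are estimated from its configuration using textbook approximations rather than the paper's fitted model.

```python
# Per-layer latency via hooks, plus a simple analytical conv-cost estimate.
import time
import torch
import torch.nn as nn

def profile_layers(model, x):
    timings, starts, handles = {}, {}, []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:          # leaf layers only
            handles.append(module.register_forward_pre_hook(
                lambda m, inp, n=name: starts.__setitem__(n, time.perf_counter())))
            handles.append(module.register_forward_hook(
                lambda m, inp, out, n=name: timings.__setitem__(
                    n, (time.perf_counter() - starts[n]) * 1e3)))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return timings

def conv_cost(conv, out_h, out_w):
    # multiply-accumulates and output-activation memory for one Conv2d layer
    macs = (conv.in_channels * conv.out_channels *
            conv.kernel_size[0] * conv.kernel_size[1] * out_h * out_w)
    act_bytes = conv.out_channels * out_h * out_w * 4  # float32 activations
    return macs, act_bytes

net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.MaxPool2d(2), nn.Conv2d(16, 32, 3, padding=1))
x = torch.randn(1, 3, 64, 64)
for layer, ms in profile_layers(net, x).items():
    print(f"layer {layer}: {ms:.3f} ms")
print("conv2 MACs / activation bytes:", conv_cost(net[3], 32, 32))
```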
Learning Bodily and Temporal Attention in Protective Movement Behavior Detection
For people with chronic pain, the assessment of protective behavior during
physical functioning is essential to understand their subjective pain-related
experiences (e.g., fear and anxiety toward pain and injury) and how they deal
with such experiences (avoidance or reliance on specific body joints), with the
ultimate goal of guiding intervention. Advances in deep learning (DL) can
enable the development of such interventions. Using the EmoPain MoCap dataset,
we investigate how attention-based DL architectures can be used to improve the
detection of protective behavior by capturing the most informative temporal and
body configurational cues characterizing specific movements and the strategies
used to perform them. We propose an end-to-end deep learning architecture named
BodyAttentionNet (BANet). BANet is designed to learn temporal and bodily parts
that are more informative to the detection of protective behavior. The approach
addresses the variety of ways people execute a movement (including healthy
people) independently of the type of movement analyzed. Through extensive
comparison experiments with other state-of-the-art machine learning techniques
used with motion capture data, we show statistically significant improvements
achieved by using these attention mechanisms. In addition, the BANet
architecture requires far fewer parameters than the state of the art for
comparable, if not higher, performance.
Comment: 7 pages, 3 figures, 2 tables, code available, accepted at ACII 2019